
Generative Artificial Intelligence is the buzzword now used for just about anything. And when I say anything, I mean anything. Companies and organizations that are not involved in technology, or that do not need technology to run their day-to-day activities, now find a way to integrate it just to go along with the global trend. God is so good: GenAI provided the means for such companies to integrate AI into their businesses.

Before the advent of GenAI there was, of course, already Artificial Intelligence, built around the discriminative models of conventional machine learning and deep neural networks. The machine learning models include logistic regression, support vector machines, K-nearest neighbors, Naive Bayes, decision trees, and many others. The deep neural network models, in turn, can be categorized into:
1. ANN: Artificial Neural Network
2. CNN: Convolutional Neural Network
3. RNN: Recurrent Neural Network
4. LSTM: Long Short-Term Memory
5. GAN: Generative Adversarial Network
The ANN is a feed-forward neural network with input, hidden, and output layers. It is the simplest form of deep learning model and is sometimes called a multilayer perceptron. The ANN works well with tabular data, such as comma-separated values (CSV) or Excel files.

Moving forward, here comes the CNN. The CNN also has input, hidden, and output layers, but adds components such as convolution, pooling, and normalization. Like the ANN, it can solve classification and regression problems, but it works primarily with images or frames taken from videos.

This brings us to the sequence models, starting with the RNN. The RNN is basically used for text-to-text prediction. Its limitation is that it can only handle short stretches of text, because its memory of earlier inputs fades quickly. That limitation birthed what is known as the LSTM model, which can handle both short and much longer stretches of text than the RNN.

As the field evolved, it became clear that forcing both the input and the output into a fixed length was a real problem in the world of natural language. That challenge led to the ground-breaking research of 2014: sequence-to-sequence learning.
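Before moving on to sequence-to-sequence learning, here is a minimal PyTorch sketch of the three families described above. It is purely my own illustration, with made-up layer sizes, not taken from any particular paper or system:

```python
# A rough sketch of the model families mentioned above, using PyTorch.
# All layer sizes are illustrative values, not from any real system.
import torch
import torch.nn as nn

# ANN / multilayer perceptron: input layer -> hidden layer -> output layer.
# Works on flat, tabular features such as rows from a CSV or Excel file.
ann = nn.Sequential(
    nn.Linear(20, 64),   # 20 tabular features in
    nn.ReLU(),
    nn.Linear(64, 2),    # e.g. a 2-class prediction out
)

# CNN: adds convolution, pooling, and normalization on top of the same idea.
# Works on images (or video frames) rather than flat rows.
cnn = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),  # 3-channel image in
    nn.BatchNorm2d(16),                          # normalization
    nn.ReLU(),
    nn.MaxPool2d(2),                             # pooling
    nn.Flatten(),
    nn.Linear(16 * 16 * 16, 2),                  # assumes 32x32 input images
)

# RNN / LSTM: processes a sequence of word embeddings one step at a time,
# carrying a hidden state (its "memory") forward. The LSTM's gating lets it
# hold on to that memory over longer stretches of text than a plain RNN.
lstm = nn.LSTM(input_size=100, hidden_size=128, batch_first=True)

# Toy forward passes just to show the expected input shapes.
print(ann(torch.randn(1, 20)).shape)           # -> torch.Size([1, 2])
print(cnn(torch.randn(1, 3, 32, 32)).shape)    # -> torch.Size([1, 2])
out, (h, c) = lstm(torch.randn(1, 10, 100))    # a 10-word sequence
print(out.shape)                               # -> torch.Size([1, 10, 128])
```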
Seq-to-seq learning brought about the idea of an encoder and a decoder with a context vector passed between them, and it helped solve the fixed-length problem of the plain LSTM and RNN models. However, the method still relied on recurrence: the encoding part of the model is handled by an RNN, LSTM, or GRU, and the same goes for the decoding part. Now the fixed-length problem was solved, but remember: the longer a text, the harder it is to comprehend. This applies to humans too. I can remember the comprehension passages in aptitude tests at companies like KPMG, which I found difficult because I was not familiar with the subject and the passages were so long. Gosh! It's exhausting, and I performed woefully most times. What I am trying to say, in essence, is that AI models ran into this same problem before the groundbreaking GenAI we know today: the encoder-decoder with a single context vector could not comprehend longer sentences. Auspiciously, we have ATTENTION IS ALL YOU NEED.
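To make the encoder-decoder idea concrete before we get to attention, here is a minimal, illustrative PyTorch sketch, a simplification of my own rather than the exact 2014 architecture. The point to notice is that everything the encoder understood about the source sentence must be squeezed into one fixed-size context vector before the decoder ever sees it, and that single vector is exactly the bottleneck that makes long sentences hard:

```python
import torch
import torch.nn as nn

# Illustrative sizes only; the vocabulary and dimensions are made up.
EMB, HID, VOCAB = 64, 128, 1000

class Seq2Seq(nn.Module):
    """Encoder-decoder with a single context vector passed between them."""
    def __init__(self):
        super().__init__()
        self.src_emb = nn.Embedding(VOCAB, EMB)
        self.tgt_emb = nn.Embedding(VOCAB, EMB)
        self.encoder = nn.LSTM(EMB, HID, batch_first=True)
        self.decoder = nn.LSTM(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)

    def forward(self, src_ids, tgt_ids):
        # The encoder reads the whole source sentence...
        _, (h, c) = self.encoder(self.src_emb(src_ids))
        # ...and everything it "understood" is squeezed into (h, c):
        # the fixed-size context vector. Long inputs must also fit in here.
        dec_out, _ = self.decoder(self.tgt_emb(tgt_ids), (h, c))
        return self.out(dec_out)  # a score for every vocabulary word

model = Seq2Seq()
src = torch.randint(0, VOCAB, (1, 12))   # a 12-token source sentence
tgt = torch.randint(0, VOCAB, (1, 9))    # the 9 target tokens produced so far
print(model(src, tgt).shape)             # -> torch.Size([1, 9, 1000])
```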

"Attention Is All You Need" is the ground-breaking research paper that birthed the Transformer model architecture enjoyed in today's world of natural language processing. This architecture also makes use of an encoder and a decoder, but without the LSTM or RNN recurrence; attention mechanisms do the work instead. It serves as a milestone for the world of GenAI. The paper has eight authors, most of them Google researchers, alongside Aidan Gomez of the University of Toronto and Illia Polosukhin.
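To give a feel for what replaces the recurrence, here is a small sketch of the scaled dot-product attention at the core of the Transformer. The formula, softmax(QK^T / sqrt(d_k))V, comes from the paper; the tensor sizes below are just illustrative:

```python
import math
import torch

def scaled_dot_product_attention(q, k, v):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V  -- from the paper."""
    d_k = q.size(-1)
    # How strongly each word should attend to every other word.
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    weights = torch.softmax(scores, dim=-1)
    return weights @ v, weights

# A toy "sentence" of 5 tokens, each represented by a 64-dimensional vector.
x = torch.randn(1, 5, 64)
# Self-attention: queries, keys, and values all come from the same sentence.
out, attn = scaled_dot_product_attention(x, x, x)
print(out.shape, attn.shape)  # -> torch.Size([1, 5, 64]) torch.Size([1, 5, 5])
```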
Today we have so many companies using the Transformer architecture with a little tweaking, one of which is the popular OpenAI with their GPT (Generative Pre-trained Transformer). ChatGPT, the talk of the town!!!
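As a taste of what a pre-trained Transformer looks like in practice, here is a tiny sketch that runs the openly available GPT-2 model through the Hugging Face transformers library (assuming the library and a PyTorch backend are installed; GPT-2 is the older open model, not ChatGPT itself):

```python
# pip install transformers torch  (assumed to be installed)
from transformers import pipeline

# Download a small, openly available GPT model and run text generation.
generator = pipeline("text-generation", model="gpt2")

result = generator(
    "Generative AI is changing businesses because",
    max_new_tokens=40,        # keep the continuation short
    num_return_sequences=1,
)
print(result[0]["generated_text"])
```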
It doesn’t end here: so many companies have jumped on this train, with some remarkable, groundbreaking research in the field of Generative AI. Stay with me as I unfold more about GenAI….